MacNet: Transferring Knowledge from Machine Comprehension to Sequence-to-Sequence Models
Machine Comprehension (MC) is one of the core problems in natural language processing, requiring both understanding of natural language and knowledge about the world. Rapid progress has been made since the release of several benchmark datasets, and recently the state-of-the-art models even surpass human performance on the well-known SQuAD evaluation. In this paper, we transfer knowledge learned from machine comprehension to sequence-to-sequence tasks to deepen the understanding of the text. We propose MacNet: a novel encoder-decoder supplementary architecture for the widely used attention-based sequence-to-sequence models. Experiments on neural machine translation (NMT) and abstractive text summarization show that our proposed framework can significantly improve the performance of the baseline models, and our method for abstractive text summarization achieves state-of-the-art results on the Gigaword dataset.
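As a rough illustration of the transfer idea described in the abstract (not the authors' exact architecture; all function names here are hypothetical stand-ins), a minimal sketch in which features from a frozen machine-comprehension encoder are concatenated with a sequence-to-sequence encoder's states, so that the decoder's attention can see both, might look like:

```python
import numpy as np

rng = np.random.default_rng(0)

def mc_encoder(tokens, dim=4):
    # stand-in for a pretrained, frozen machine-comprehension encoder
    return rng.standard_normal((len(tokens), dim))

def seq2seq_encoder(tokens, dim=4):
    # stand-in for the task's own trainable seq2seq encoder
    return rng.standard_normal((len(tokens), dim))

def augmented_states(tokens):
    # concatenate the frozen MC features with the seq2seq states,
    # giving the decoder's attention access to both representations
    h = seq2seq_encoder(tokens)
    m = mc_encoder(tokens)
    return np.concatenate([h, m], axis=-1)  # (len, 2 * dim)

states = augmented_states(["the", "cat", "sat"])
```

In a real system the encoders would be neural networks and the concatenated states would feed the attention mechanism; this sketch only shows the wiring.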
Boyuan Pan, Yazheng Yang, Hao Li, Zhou Zhao, Yueting Zhuang, Deng Cai, Xiaofei He
Machine comprehension (MC) has gained significant popularity over the past few years and is a coveted goal in the field of natural language understanding. Its task is to teach the machine to understand the content of a given passage and then answer a related question, which requires deep comprehension and accurate information extraction from the text.
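For context on the task itself, a crude non-neural baseline (purely illustrative, unrelated to MacNet) reads a passage and a question and returns the passage sentence with the greatest lexical overlap with the question:

```python
def best_sentence(passage, question):
    # crude lexical-overlap baseline: return the passage sentence
    # sharing the most words with the question
    q = set(question.lower().split())
    sents = [s.strip() for s in passage.split(".") if s.strip()]
    return max(sents, key=lambda s: len(q & set(s.lower().split())))

answer = best_sentence("The sky is blue. Cats chase mice.", "What do cats chase")
```

Modern MC models replace this word matching with learned representations of the passage and question, but the input/output contract is the same.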
Reviews: MacNet: Transferring Knowledge from Machine Comprehension to Sequence-to-Sequence Models
Update after Author Feedback: After reading all the reviews and the author feedback, I have two overall comments. The paper is branded as a transfer learning paper, but I'm left disappointed in this respect. I find it very surprising that the attention can be transferred at all, but it is such a small contribution to the MacNet architecture's overall improvements that it seems a hard sell. Focal losses have been used before and encoders have been transferred before, but they also contribute to performance improvements... Second comment: the ablations on summarization are necessary for a camera-ready version -- that seems like a hole right now, so I hope they are included in future versions. Overall, I'm still a 6 because the paper finds a combination of things (with some surprising novelty) that improves performance, and it has shown me that I should experiment with those things in the future.
Learning to Answer Multilingual and Code-Mixed Questions
Question answering (QA), which comes naturally to humans, is a critical component of seamless human-computer interaction. It has emerged as one of the most convenient and natural methods to interact with the web and is especially desirable in voice-controlled environments. Despite being one of the oldest research areas, current QA systems face the critical challenge of handling multilingual queries. To build an Artificial Intelligence (AI) agent that can serve multilingual end users, a QA system must be language-versatile and tailored to the multilingual environment. Recent advances in QA models have enabled surpassing human performance, primarily due to the availability of sizable amounts of high-quality data. However, the majority of such annotated datasets are expensive to create and are confined to the English language, making it difficult to gauge progress in other languages. Therefore, to measure a similar improvement in multilingual QA systems, it is necessary to invest in high-quality multilingual evaluation benchmarks. In this dissertation, we focus on advancing QA techniques for handling end-user queries in multilingual environments. The dissertation consists of two parts. In the first part, we explore multilingualism and a new dimension of multilingualism referred to as code-mixing. In the second, we propose a technique for multi-hop question generation that exploits multiple documents. Experiments show our models achieve state-of-the-art performance on answer extraction, ranking, and generation tasks across multiple domains of MQA, VQA, and language generation. The proposed techniques are generic and can be widely used across domains and languages to advance QA systems.
Modular Approach to Machine Reading Comprehension: Mixture of Task-Aware Experts
Rayasam, Anirudha, Kamath, Anusha, Kalejaiye, Gabriel Bayomi Tinoco
In this work we present a Mixture of Task-Aware Experts Network for Machine Reading Comprehension on a relatively small dataset. We particularly focus on the issue of common-sense learning, enforcing common ground knowledge by specifically training different expert networks to capture different kinds of relationships between each passage, question, and choice triplet. Moreover, we take inspiration from recent advancements in multitask and transfer learning by training each network on a relevant, focused task. By making the mixture of networks aware of a specific goal through an enforced task and relationship, we achieve state-of-the-art results and reduce over-fitting.
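The core mechanism behind a mixture-of-experts model of this kind can be sketched in a few lines (an illustrative toy, not this paper's network): a gate assigns a softmax weight to each expert given the input, and the expert outputs are combined by those weights.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax
    e = np.exp(x - x.max())
    return e / e.sum()

def mixture_of_experts(x, experts, gate_w):
    # the gate scores each expert from the input, then the expert
    # outputs are blended by the softmax-normalized scores
    weights = softmax(gate_w @ x)                 # (n_experts,)
    outputs = np.stack([f(x) for f in experts])   # (n_experts, dim)
    return weights @ outputs                      # (dim,)

# toy "experts" standing in for task-aware networks
experts = [lambda x: 2 * x, lambda x: x + 1, lambda x: -x]
rng = np.random.default_rng(1)
x = np.ones(3)
y = mixture_of_experts(x, experts, rng.standard_normal((3, 3)))
```

Making the experts "task-aware", as the paper does, amounts to training each expert function on its own focused objective before (or while) the gate learns how to weight them.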
A Short Survey of Pre-trained Language Models for Conversational AI - A New Age in NLP
Zaib, Munazza, Sheng, Quan Z., Zhang, Wei Emma
Building a dialogue system that can communicate naturally with humans is a challenging yet interesting problem in agent-based computing. Rapid growth in this area is usually hindered by the long-standing problem of data scarcity, as these systems are expected to learn syntax, grammar, decision making, and reasoning from insufficient amounts of task-specific data. The recently introduced pre-trained language models have the potential to address the issue of data scarcity and bring considerable advantages by generating contextualized word embeddings. These models are considered the NLP counterpart of ImageNet and have been shown to capture different facets of language such as hierarchical relations, long-term dependencies, and sentiment. In this short survey paper, we discuss the recent progress made in the field of pre-trained language models. We also discuss how the strengths of these language models can be leveraged in designing more engaging and more eloquent conversational agents. This paper, therefore, intends to establish whether these pre-trained models can overcome the challenges pertinent to dialogue systems, and how their architecture could be exploited in order to overcome these challenges. Open challenges in the field of dialogue systems are also deliberated.
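The distinction between static and contextualized word embeddings, central to the survey above, can be illustrated with a toy model (the vectors and mixing rule here are invented for illustration, nothing like a real pre-trained language model): a static embedding gives a word one fixed vector, while a contextualized one varies with the surrounding sentence.

```python
import numpy as np

# fixed ("static") vectors, one per word regardless of context
STATIC = {"bank": np.array([1.0, 0.0]),
          "river": np.array([0.0, 1.0]),
          "money": np.array([1.0, 1.0])}

def contextual(word, sentence):
    # crude stand-in for contextualization: mix the word's static
    # vector with the mean vector of its context words
    ctx = np.mean([STATIC[w] for w in sentence if w != word], axis=0)
    return 0.5 * STATIC[word] + 0.5 * ctx

a = contextual("bank", ["river", "bank"])   # "bank" near "river"
b = contextual("bank", ["money", "bank"])   # "bank" near "money"
```

A real pre-trained language model computes this context-dependence with deep self-attention rather than averaging, but the key property is the same: `a` and `b` differ even though the word is identical.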